Note: There are often multiple ways to answer each question.

For the following questions, explain what went wrong and how the program can be fixed.

Load these packages before starting the problems:

library(ggplot2)
library(dplyr)
  1. We want to add x and y together to get the value of 24:
x <- 14
y <- "10"
x + y
## Error in x + y: non-numeric argument to binary operator

We cannot add a numeric variable and a string together in this way. We first need to “coerce” y into a numeric variable:

x <- 14
y <- "10"
x + as.numeric(y)
## [1] 24
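
As a small sketch of how this coercion behaves (the names z and bad are illustrative and not part of the exercise), we can check each variable's class first and see what happens when the string does not look like a number:

```r
x <- 14
y <- "10"

# Check the type of each variable before combining them
class(x)  # "numeric"
class(y)  # "character"

# as.numeric() parses strings that look like numbers
z <- x + as.numeric(y)
z  # 24

# A string that is not a number becomes NA (normally with a warning)
bad <- suppressWarnings(as.numeric("ten"))
is.na(bad)  # TRUE
```
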
  1. We want to add 1+2 and divide it by 3+4:
((1+2)/(3+4)))
## Error: <text>:1:14: unexpected ')'
## 1: ((1+2)/(3+4)))
##                  ^

There is one too many ) at the end of the line. We can fix this by removing it:

((1+2)/(3+4))
## [1] 0.4285714

For the rest of the questions, we will use the mtcars dataset:

data(mtcars)
  1. We want to save the vector of numbers 1, 2, …, L-1 into the variable x, where L is the number of columns in mtcars.
x <- 1:ncol(mtcars)-1
x
##  [1]  0  1  2  3  4  5  6  7  8  9 10

The : operator takes precedence over the binary -, meaning that it is evaluated first. Thus, the right-hand side is equivalent to c(1, 2, ..., ncol(mtcars)) - 1. Whenever we have a vector minus a single number, that number is subtracted from each element.

We can fix this by inserting parentheses:

x <- 1:(ncol(mtcars)-1)
x
##  [1]  1  2  3  4  5  6  7  8  9 10
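
The precedence rule can be checked directly. The sketch below also shows seq_len() as one possible alternative that sidesteps the issue, since it takes the desired length as a single argument (the names a and b are illustrative):

```r
data(mtcars)

# `:` binds more tightly than binary `-`, so these two are the same
a <- 1:ncol(mtcars) - 1
b <- (1:ncol(mtcars)) - 1
identical(a, b)  # TRUE

# seq_len() takes the length directly, so no extra parentheses are needed
x <- seq_len(ncol(mtcars) - 1)
identical(x, 1:(ncol(mtcars) - 1))  # TRUE
```
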
  1. We want to make a scatterplot of mpg vs. wt:
ggplot(data = mtcars) +
    geom_point(y = mpg, x = wt)
## Error in layer(data = data, mapping = mapping, stat = stat, geom = GeomPoint, : object 'wt' not found

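We forgot to wrap the arguments to geom_point in aes(...). Because of that, R thinks we want x to match a wt variable, but it cannot find a variable named wt in our environment. (No error is raised for mpg only because ggplot2 ships with a dataset that happens to be named mpg.) Fix: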

ggplot(data = mtcars) +
    geom_point(aes(y = mpg, x = wt))
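
Another way to write the fix, sketched here as one option among several, is to declare the mapping once in ggplot() so that geom_point() inherits it. The object name p is illustrative; layer_data() is used only to confirm that the columns were looked up in mtcars:

```r
library(ggplot2)
data(mtcars)

# Mapping declared once in ggplot(); geom_point() inherits it
p <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
    geom_point()
p

# One plotted point per row of mtcars
nrow(layer_data(p))  # 32
```
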

  1. We want to make a histogram of mpg:
ggplot(data = mtcars)
    + geom_histogram(aes(x = mpg))

## Error: Cannot use `+.gg()` with a single argument. Did you accidentally put + on a new line?

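The error message tells us what went wrong: R treats the first line as a complete statement, so the second line begins with a + that has nothing on its left. We can fix it by moving the + to the end of the previous line: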

ggplot(data = mtcars) +
    geom_histogram(aes(x = mpg))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
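
The message printed with the histogram suggests choosing a bin width explicitly. A minimal sketch, assuming a width of 2 mpg (an arbitrary illustrative value, not one given in the exercise):

```r
library(ggplot2)
data(mtcars)

# binwidth is set outside aes() because it is a fixed setting, not a mapping
p <- ggplot(data = mtcars) +
    geom_histogram(aes(x = mpg), binwidth = 2)
p

# Each bar spans exactly 2 units on the x axis
bins <- layer_data(p)
range(bins$xmax - bins$xmin)
```
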

  1. We want to make a boxplot of mpg for each value of cyl, overlay it with points, and add a title to the plot:
ggplot(data = mtcars, aes(x = cyl, y = mpg)) +
    geom_boxplot() +
    geom_point()
    labs(title = "Plot of mpg vs. cyl")
## Warning: Continuous x aesthetic -- did you forget aes(group=...)?

## $title
## [1] "Plot of mpg vs. cyl"
## 
## attr(,"class")
## [1] "labels"

There are two problems here. First, we only get one boxplot instead of three (cyl takes the values 4, 6, or 8). This is because cyl is stored as a number, so ggplot treats it as a continuous variable; as the warning suggests, we can add group = cyl in aes(...). Second, there is no title for the plot because we forgot to add a + after the third line of code, so labs(...) is evaluated on its own and its value is printed instead of being added to the plot. Fix:

ggplot(data = mtcars, aes(x = cyl, y = mpg, group = cyl)) +
    geom_boxplot() +
    geom_point() +
    labs(title = "Plot of mpg vs. cyl")
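
An alternative to group = cyl, shown here as a sketch rather than the only fix, is to convert cyl to a factor so that ggplot treats it as discrete. The x = "cyl" label is an added assumption to keep the axis title tidy:

```r
library(ggplot2)
data(mtcars)

# factor(cyl) makes the x axis discrete, giving one boxplot per level
p <- ggplot(data = mtcars, aes(x = factor(cyl), y = mpg)) +
    geom_boxplot() +
    geom_point() +
    labs(title = "Plot of mpg vs. cyl", x = "cyl")
p

# The boxplot layer now has one row per value of cyl
nrow(layer_data(p, 1))  # 3
```
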

  1. We want a scatterplot of qsec vs. wt, but we want all the points to be colored blue:
ggplot(data = mtcars, aes(y = qsec, x = wt)) +
    geom_point(aes(col = "blue"))

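Inside aes(), "blue" is treated as a data value to be mapped through a color scale rather than as the color itself, so the points are drawn in the default palette color and a legend entry labeled "blue" appears. If we want the color of the points NOT to be data-dependent, then it should not go into the aes call: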

ggplot(data = mtcars, aes(y = qsec, x = wt)) +
    geom_point(col = "blue")
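
To see the difference without relying on the rendered image, we can inspect the values each layer actually draws with. This is a sketch using layer_data() from ggplot2; the object names are illustrative:

```r
library(ggplot2)
data(mtcars)

# Mapped: "blue" is a constant data value passed through the colour scale
p_mapped <- ggplot(data = mtcars, aes(y = qsec, x = wt)) +
    geom_point(aes(col = "blue"))

# Fixed: "blue" is set directly as the point colour
p_fixed <- ggplot(data = mtcars, aes(y = qsec, x = wt)) +
    geom_point(col = "blue")

# layer_data() returns the computed aesthetics for a layer
mapped_col <- unique(layer_data(p_mapped)$colour)
fixed_col <- unique(layer_data(p_fixed)$colour)
mapped_col  # a default palette colour, not "blue"
fixed_col   # "blue"
```
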

  1. We want to create a new column which is miles per quart and display the first 3 rows:
mtcars %>% mutate(miles per quart = mpg / 4) %>% head(n = 3)
## Error: <text>:1:25: unexpected symbol
## 1: mtcars %>% mutate(miles per
##                             ^

If we want column names to have spaces, we need to surround the new column name with backticks:

mtcars %>% mutate(`miles per quart` = mpg / 4) %>% head(n = 3)
##    mpg cyl disp  hp drat    wt  qsec vs am gear carb miles per quart
## 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4            5.25
## 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4            5.25
## 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1            5.70
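
A column whose name contains spaces needs the same quoting every time it is used later. The sketch below shows two ways to access it, and a syntactic name (miles_per_quart, an illustrative choice) that avoids backticks altogether:

```r
library(dplyr)
data(mtcars)

res <- mtcars %>% mutate(`miles per quart` = mpg / 4) %>% head(n = 3)

# Backticks are needed again with $, or use [[ ]] with a string
res$`miles per quart`
res[["miles per quart"]]

# A name without spaces can be used without any quoting
res2 <- mtcars %>% mutate(miles_per_quart = mpg / 4) %>% head(n = 3)
res2$miles_per_quart
```
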
  1. We want to compute the mean mpg for each value of gear:
mtcars %>% group_by(gear) %>% summarize(mean = mean)
## Error: Column `mean` is of unsupported type function

We forgot to say that we should be taking the mean of mpg. Written on its own, mean refers to the function itself rather than to a computed value:

mtcars %>% group_by(gear) %>% summarize(mean = mean(mpg))
## # A tibble: 3 x 2
##    gear  mean
##   <dbl> <dbl>
## 1     3  16.1
## 2     4  24.5
## 3     5  21.4
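
As a cross-check, the same group means can be computed in base R with tapply(). This is a sketch; the names by_gear, mean_mpg, and base_means are illustrative:

```r
library(dplyr)
data(mtcars)

# `mean` alone is a function object; `mean(mpg)` is a call that returns a number
is.function(mean)  # TRUE

by_gear <- mtcars %>% group_by(gear) %>% summarize(mean_mpg = mean(mpg))

# Base R equivalent: apply mean() to mpg within each value of gear
base_means <- tapply(mtcars$mpg, mtcars$gear, mean)
round(base_means, 1)
```
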
  1. We want to compute the maximum hp and disp in the dataset:
mtcars %>% summarize(max = max(hp, disp))
##   max
## 1 472

The code above looks for the single maximum across the two columns hp and disp and returns just that value. To compute the maximum of each column, we need a separate max() call for each:

mtcars %>% summarize(max_hp = max(hp),
                      max_disp = max(disp))
##   max_hp max_disp
## 1    335      472
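
When more columns are involved, writing one max() call per column becomes tedious. Two possible shortcuts are sketched below: sapply() from base R, and across(), which assumes dplyr 1.0 or later is installed. The names col_max and col_max_df are illustrative:

```r
library(dplyr)
data(mtcars)

# Base R: apply max() to each selected column
col_max <- sapply(mtcars[c("hp", "disp")], max)
col_max

# dplyr (assumes version >= 1.0): apply max() across the listed columns
col_max_df <- mtcars %>% summarize(across(c(hp, disp), max))
col_max_df
```
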